Parse Fitting and Prose Fixing: Getting a Hold on Ill-Formedness

Authors

  • Karen Jensen
  • George E. Heidorn
  • Lance A. Miller
  • Yael Ravin
Abstract

Processing syntactically ill-formed language is an important mission of the EPISTLE system. Ill-formed input is treated by this system in various ways. Misspellings are highlighted by a standard spelling checker; syntactic errors are detected and corrections are suggested; and stylistic infelicities are called to the user's attention. Central to the EPISTLE processing strategy is its technique of fitted parsing. When the rules of a conventional syntactic grammar are unable to produce a parse for an input string, this technique can be used to produce a reasonable approximate parse that can serve as input to the remaining stages of processing. This paper first describes the fitting process and gives examples of ill-formed language situations where it is called into play. We then show how a fitted parse allows EPISTLE to carry on its text-critiquing mission where conventional grammars would fail either because of input problems or because of limitations in the grammars themselves. Some inherent difficulties of the fitting technique are also discussed. In addition, we explore how style critiquing relates to the handling of ill-formed input, and how a fitted parse can be used in style checking.
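
The fallback idea in the abstract (produce an approximate parse when the grammar yields no complete one) can be illustrated with a small sketch. This is not the EPISTLE implementation: the `Span` type, the `HEAD_RANK` preference order, the placeholder label `X`, and the function names are all assumptions made for illustration. The sketch picks one partial constituent as the head and then covers the remaining tokens on either side with the widest partial constituents available.

```python
from typing import List, NamedTuple


class Span(NamedTuple):
    """A partial constituent found by the grammar (hypothetical type)."""
    label: str  # e.g. "VP", "NP", "PP"
    start: int  # index of the first token covered (inclusive)
    end: int    # index just past the last token covered (exclusive)


# Assumed preference order for the head constituent (lower is preferred).
HEAD_RANK = {"VP": 0, "NP": 1, "PP": 2}


def width(span: Span) -> int:
    return span.end - span.start


def widest_cover(spans: List[Span], lo: int, hi: int) -> List[Span]:
    """Cover tokens [lo, hi) left to right, taking the widest span that
    starts at each position; uncovered tokens become one-token "X" spans."""
    cover = []
    pos = lo
    while pos < hi:
        candidates = [s for s in spans if s.start == pos and s.end <= hi]
        best = max(candidates, key=width) if candidates else Span("X", pos, pos + 1)
        cover.append(best)
        pos = best.end
    return cover


def fit_parse(n_tokens: int, spans: List[Span]) -> List[Span]:
    """Return a flat approximate parse: a head constituent plus fitted
    constituents to its left and right (a sketch, not the paper's rules)."""
    if not spans:
        return widest_cover(spans, 0, n_tokens)
    # Prefer the best-ranked label, then the widest span, then the leftmost.
    head = min(
        spans,
        key=lambda s: (HEAD_RANK.get(s.label, len(HEAD_RANK)), -width(s), s.start),
    )
    left = widest_cover(spans, 0, head.start)
    right = widest_cover(spans, head.end, n_tokens)
    return left + [head] + right


# Tokens: ["Sincerely", ",", "the", "committee", "agrees"]
partial = [Span("AVP", 0, 1), Span("NP", 2, 4), Span("VP", 2, 5)]
fitted = fit_parse(5, partial)
```

Here `fitted` covers every token exactly once, so later processing stages always receive some structure to work with, even when no rule in the grammar spans the whole input.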

Similar resources

Parsing Heterogeneous Corpora with a Rich Dependency Grammar

Philologist: I need to parse Old French texts of different types (verse, prose, dialects etc.). Do I have to train separate parser models? Computational Linguist: You won’t lose much if you train the parser on all the data you have. P: I can’t do the training myself. What can I expect from existing parser models? C: If the training corpus contained 12th century verse texts, you are best prepare...

Integration of Syntactic, Semantic and Contextual Information in Processing Grammatically Ill-Formed Inputs

This paper describes an integrated method for processing grammatically ill-formed inputs. We use partial parses of the input for recovering from parsing failure. In order to select partial parses appropriate for error recovery, cost and reward are assigned to them. Cost and reward represent the badness and goodness of a partial parse, respectively. The most appropriate partial parse is selected ...

Kronelope in Ulitskaya's Short Stories

The features of the composition of small prose by L. Ulitskaya (on the example of the stories "Bronka", "Happy", "Poor Relatives") are analyzed. Particular attention is paid to the features of picturing time in stories, to the past which is sometimes more important for understanding the motives and characters of L. Ulitskaya’s heroes than the present and even...

The Good, the Bad and the Ugly: Well-Formedness of Live Sequence Charts

The Live Sequence Chart (LSC) language is a conservative extension of the well-known visual formalism of Message Sequence Charts. An LSC specification formally captures requirements on the inter-object behaviour in a system as a set of scenarios. As with many languages, there are LSCs which are syntactically correct but unsatisfiable due to internal contradictions. The authors of the original p...

Grammar Error Detection with Best Approximated Parse

In this paper, we propose that grammar error detection be disambiguated in generating the connected parse(s) of optimal merit for the full input utterance, in overcoming the cheapest error. The detected error(s) are described as violated grammatical constraints in a framework for Model-Theoretic Syntax (MTS). We present a parsing algorithm for MTS, which only relies on a grammar of well-formedne...

Journal:
  • American Journal of Computational Linguistics

Volume 9

Publication date: 1983